A Fully Automated Derivation of State-Based Eigentriphones for Triphone Modeling with No Tied States Using Regularization
Abstract
Recently we proposed an alternative method, called eigentriphone, to solve the data insufficiency problem in triphone acoustic modeling without the need for state tying. The idea is to treat the acoustic modeling of infrequent triphones (“poor triphones”) as an adaptation problem from the more frequent triphones (“rich triphones”): first, an eigenbasis is derived over the rich triphones that have sufficient training data, and its eigenvectors are called eigentriphones; then the poor triphones are adapted in a fashion similar to eigenvoice adaptation. Since, in general, no states are tied in our method, all triphones (states) are distinct, so they can be more discriminative than tied-state triphones. In our previous work, the number of eigentriphones was determined in advance with a set of development data. In this paper, we investigate simply using all of them, with regularization naturally penalizing the less important ones. In addition, the model-based eigenbasis is replaced by three state-based eigenbases. Experimental evaluation on the WSJ 5K task shows that triphone models trained using our new eigentriphone approach without state tying perform at least as well as the common tied-state triphone models.
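The two-step idea in the abstract (derive an eigenbasis from rich triphones, then adapt poor triphones with a penalty on the less important eigenvectors) can be illustrated with a minimal NumPy sketch. It is not the method of the paper: it assumes each triphone state is summarized by a fixed-length vector, swaps the likelihood-based adaptation for a plain least-squares fit, and assumes a penalty inversely proportional to each eigenvalue. The names `derive_eigenbasis`, `adapt_poor`, and `beta` are hypothetical.

```python
import numpy as np


def derive_eigenbasis(rich):
    """Derive an eigenbasis from rich-triphone state vectors.

    rich: array of shape (n_rich, dim), one row per rich triphone state
    (an assumed representation, e.g. stacked Gaussian means).
    Returns the mean vector, the eigenvectors as rows, and the eigenvalues
    in descending order.
    """
    mean = rich.mean(axis=0)
    centered = rich - mean
    # Rows of vt are orthonormal eigenvectors of the sample covariance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s ** 2 / max(len(rich) - 1, 1)
    return mean, vt, eigvals


def adapt_poor(target, mean, basis, eigvals, beta=1.0, eps=1e-8):
    """Regularized projection of a poor-triphone estimate onto the eigenbasis.

    Minimizes ||target - mean - w @ basis||^2 + beta * sum(w_k^2 / eigval_k).
    Because the basis rows are orthonormal, the minimizer has a closed form:
    each plain projection coefficient is shrunk by 1 / (1 + beta / eigval_k),
    so directions with small eigenvalues are penalized most.
    This least-squares objective is an assumption made for illustration.
    """
    proj = basis @ (target - mean)        # unregularized coefficients
    penalty = beta / (eigvals + eps)      # heavier for small eigenvalues
    weights = proj / (1.0 + penalty)
    return mean + weights @ basis, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "rich" states with unequal variance per dimension.
    rich = rng.normal(size=(50, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])
    mean, basis, eigvals = derive_eigenbasis(rich)
    noisy_poor = rng.normal(size=6)       # stand-in for a poorly trained state
    adapted, weights = adapt_poor(noisy_poor, mean, basis, eigvals, beta=0.5)
    print(np.round(weights, 3))
```

All eigenvectors are kept and the penalty decides how much each one contributes, so no development set is needed to choose a cut-off. For the three state-based eigenbases mentioned in the abstract, one would presumably call `derive_eigenbasis` once per state position rather than once per whole model.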
Similar resources
Distinct triphone acoustic modeling using deep neural networks
To strike a balance between robust parameter estimation and detailed modeling, most automatic speech recognition systems are built using tied-state continuous density hidden Markov models (CDHMM). Consequently, states that are tied together in a tied-state are not distinguishable, introducing quantization errors inevitably. It has been shown that it is possible to model (almost) all distinct tr...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Triphone State-Tying via Deep Canonical Correlation Analysis
Context-dependent phone models are used in modern speech recognition systems to account for co-articulation effects. Due to the vast number of possible context-dependent phones, state-tying is typically used to reduce the number of target classes for acoustic modeling. We propose a novel approach for state-tying which is completely data dependent and requires no domain knowledge. Our method firs...
Speech Recognition Using Monophone and Triphone Based Continuous Density Hidden Markov Models
Speech recognition is the process of transcribing speech to text. Phoneme-based modeling is used, wherein each phoneme is represented by a Continuous Density Hidden Markov Model. Mel Frequency Cepstral Coefficients (MFCC) are extracted from the speech signal; delta and double-delta features representing the temporal rate of change of features are added, which considerably improves the recognition accura...
Monte Carlo Simulation to Compare Markovian and Neural Network Models for Reliability Assessment in Multiple AGV Manufacturing System
We compare two approaches for a Markovian model in flexible manufacturing systems (FMSs) using Monte Carlo simulation. The model, which is a development of Fazlollahtabar and Saidi-Mehrabad (2013), considers two features of automated flexible manufacturing systems equipped with automated guided vehicles (AGVs), namely the reliability of machines and the reliability of AGVs in a multiple AGV jobsho...
Publication date: 2011